ABSTRACT

Eukaryotic ribosome biogenesis requires the con- 
certed action of numerous ribosome assembly 
factors, for most of which structural and functional 
information is currently lacking. Nob1, which can be 
identified in eukaryotes and archaea, is required for 
the final maturation of the small subunit ribosomal 
RNA in yeast by catalyzing cleavage at site D after 
export of the preribosomal subunit into the cyto- 
plasm. Here, we show that this also holds true for 
Nob1 from the archaeon Pyrococcus horikoshii, 
which efficiently cleaves RNA-substrates containing 
the D-site of the preribosomal RNA in a 
manganese-dependent manner. The structure of 
PhNob1 solved by nuclear magnetic resonance
spectroscopy revealed a PIN domain common with 
many nucleases and a zinc ribbon domain, which 
are structurally connected by a flexible linker. We 
show that amino acid residues required for sub- 
strate binding reside in the PIN domain whereas 
the zinc ribbon domain alone is sufficient to bind 
helix 40 of the small subunit rRNA. This suggests 
that the zinc ribbon domain acts as an anchor 
point for the protein on the nascent subunit pos- 
itioning it in the proximity of the cleavage site. 

INTRODUCTION 

Ribosomes are important macromolecular complexes that 
synthesize proteins in all organisms. Bacterial ribosomes
have been studied extensively, both structurally and functionally
(1,2), and their subunits can be assembled in vitro
using purified components (3). Besides ribosomal proteins 
and RNAs, ribosome assembly in bacteria is facilitated by 
a small number of non-ribosomal assembly factors in vivo 
(4,5). In contrast, ribosome biogenesis in eukaryotes 
requires a plethora of cofactors. The process has been 
mainly studied in the yeast Saccharomyces cerevisiae, 
where at least 75 small nucleolar RNAs and more than 
200 proteins are involved (6-8). These non-ribosomal 
proteins include many enzymes and RNA-binding cofac- 
tors such as endonucleases and exonucleases required for 
the processing of preribosomal RNA (pre-rRNA). 

Ribosome biogenesis in yeast is initiated in the nucle- 
olus by the transcription of the 35S pre-rRNA, which 
contains the sequences of the 18S, 5.8S and 25S rRNAs 
(9). The mature rRNAs are generated in a sequence of 
endonucleolytic cleavage and exonucleolytic trimming re- 
actions mediated by different nucleases located in the nu- 
cleolus, nucleus and cytoplasm. After initial cleavages at 
the positions A0 and A1, the cleavage at A2 results in the
separation of the biogenesis pathways of the small ribosomal
subunits (SSU) and large ribosomal subunits (LSU)
and produces the 20S (SSU) and 27SA2 (LSU) pre-rRNAs.
Several biogenesis steps of the LSU occur
before nuclear export, such as processing steps to 
generate the 6S intermediate and the mature 25S and the 
incorporation of the separately transcribed 5S rRNA, 
while the final trimming of 6S to generate the mature 
5.8S is cytoplasmic (10). 

The cleavage of the 20S pre-rRNA to the mature 18S 
rRNA also occurs after nuclear export to the cytoplasm 
(11). Several protein cofactors have been suggested to be 
involved in this step in the yeast S. cerevisiae including the
endonuclease Nob1, the putative kinase Fap7, the RNA
methyltransferase Dim1, the GTPase Ltv1 and the RNA
helicase Prp43 with its cofactor Sqs1. Nob1 catalyzes the
cleavage of the 20S pre-rRNA at site D, while the role of
Fap7 has remained unclear so far (12,13). The RNA
helicase Prp43 and its G-patch protein cofactor Sqs1
have been shown to genetically interact with Nob1 and
Ltv1 and are thought to mediate structural rearrangements
that allow efficient cleavage by the endonuclease
Nob1 (13-15). The methyltransferase Dim1 is required
for dimethylation of two adenines close to the 3'-end of
the 18S rRNA (16), while Ltv1 is thought to act in
cofactor release after export of the pre-40S subunits (17).
Assembly of the head domain of the SSU and integration 
of the ribosomal protein rpS5 has further been shown to 
influence pre-rRNA cleavage at site D (18). 

Nob1 was originally identified in yeast as a protein
associated with the proteasome (19), before its require- 
ment for pre-rRNA processing was discovered (12). It 
was suggested to interact with the pre-SSU RNA in the 
vicinity of site D in yeast (20) and cross-links to helix 40 in 
the 20S pre-rRNA in vivo (21). Based on the analysis of 
sequence similarities, it has been proposed that Nob1
contains a PilT N-terminus (PIN) domain common to 
many other exonucleases or endonucleases (22,23) and a 
zinc ribbon domain (24). Yeast Nob1 has been shown to
cleave pre-rRNA sequences at site D in vitro (13) and its 
enzymatic function has been suggested to be regulated by 
a conformational switch of the RNA and by Dim2 in vivo 
(15,25). Mutagenesis of residues in the catalytic core of the
endonuclease, the proposed PIN domain, leads to accumulation
of the 20S pre-rRNA and a decrease in levels of
mature 18S in yeast (26). In general, PIN domain 
proteins have been shown to possess endonucleolytic 
activity, e.g. the exosome component Rrp44 (27,28) and 
the NMD factor Smg6 (29). 

Human Nob1 was also found to be involved in
ribosome biogenesis (30,31). Furthermore, nob1 was
recently identified as one of the six marker genes to dis- 
tinguish the chronic phase from the blast crisis of chronic 
myeloid leukemia (32) and as an oncogenic factor in 
ovarian cancer (33). It was also found to be upregulated 
in noise-injured cochlea and it is suggested to be involved 
in adaptation to an acoustic trauma (34). 

In archaea, many components of the ribosome biogen- 
esis pathway are thought to be conserved and to 
play similar roles in the process as in yeast. Several 
proteins required for 40S biogenesis are conserved in 
archaea, including Nep1, Fap7, Rio2 as well as the
methyltransferase Dim1 and its associated protein Dim2.
Their structural analysis unraveled fundamental principles 
of the molecular action of these factors (35-40), which 
were instrumental for understanding the function of the 
related factors in eukaryotes. Here, we characterize the 
Nob1 homolog from the thermophilic archaeon
Pyrococcus horikoshii (PhNob1). The nuclear magnetic
resonance (NMR)-solution structure of PhNob1 reveals
the presence of an N-terminal PIN domain and a
C-terminal zinc ribbon domain connected by a flexible
linker. In contrast to eukaryotic Nob1 proteins, PhNob1
lacks a large insertion in the PIN domain and a long
C-terminal tail. In vitro, the full-length protein
endonucleolytically cleaves RNA substrates that
resemble the D-site of the P. horikoshii pre-SSU RNA in
a Mn2+-dependent manner specifically at the site that corresponds
to the mature 3'-end of the 16S rRNA. The zinc
ribbon domain of PhNob1 is sufficient to bind helix 40 of
P. horikoshii 16S rRNA. Based on the structure of
PhNob1, we have mutated residues required for efficient
and selective cleavage of the substrate RNA. These results
show that both the domain structure and function of
Nob1 are conserved in archaea and eukaryotes.

MATERIALS AND METHODS 

Sequence search, alignment and phylogenetic analyses 

Sequences were collected from PFAM (PF08772) (41) and 
by PSIBLAST from GenBank (42,43). Multiple sequence 
alignment was calculated with MAFFT v6.846b and a 
maximum likelihood phylogeny was reconstructed with 
RAxML v7.2.6 using gamma-distributed rate heterogeneity
and the WAG substitution matrix (44-46). Branch
support values were calculated with the rapid bootstrap 
method of RAxML (45) from 1000 bootstrap trees. 

Protein and RNA constructs used in this study 

The PhNob1-coding sequence was codon optimized for
expression in Escherichia coli and the synthetic gene
(Entelechon) was cloned (NdeI and BamHI) into
pET11a. Mutants were generated by site-directed mutagenesis.
The PIN domain (amino acids 1-120) and zinc
ribbon motif (amino acids 121-161) were cloned into a
pQE80 derivative generating a His10-ZZ-TEV tag (47).
RNA constructs for in vitro transcription were generated
in a pGEM4Z vector (HindIII and EcoRI) and hairpin
constructs were designed as described (13,28). RNA constructs
used for in vitro cleavage assays contained the
pre-rRNA sequence GGG AGA CAA GCU UAA GUC 
GUA ACA AGG UAG CCG UAG GGG AAC CUA 
CGG CUC GAU CAC CUC CUA UCG CCG GAA 
ACC CCG UCC GGG GGA AUU (PhNlong90), while 
the long and short hairpin loops included the sequences 
AUC ACC UCC UAU CGC C and CGA UCA CCU 
CCU AUC GCC, respectively. RNAs for gel shift 
assays, analytical gel filtration, fluorescence anisotropy 
and NMR-titration experiments were synthesized com- 
mercially (Dharmacon) and deprotected as described by 
the manufacturer. They were folded into monomeric 
species by denaturing them at 95°C for 5 min and subsequent
dilution with ice-cold water.
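
The construct names and sequences above can be sanity-checked with a few lines of code. The following is a minimal sketch, not part of the original methods: it only confirms that the cleavage substrate is 90 nucleotides long, as the name PhNlong90 suggests, and that both hairpin loop sequences occur within it. The helper names `locate` and `is_rna` and the dictionary keys are illustrative assumptions.

```python
# Sequences copied from the text, with the spaces between triplets removed.
PHN_LONG90 = (
    "GGGAGACAAGCUUAAGUC"
    "GUAACAAGGUAGCCGUAGGGGAACCUA"
    "CGGCUCGAUCACCUCCUAUCGCCGGAA"
    "ACCCCGUCCGGGGGAAUU"
)

# The two hairpin loop sequences, in the order they are listed in the text.
HAIRPIN_LOOPS = {
    "first listed loop": "AUCACCUCCUAUCGCC",
    "second listed loop": "CGAUCACCUCCUAUCGCC",
}


def is_rna(seq):
    """True if the sequence uses only the four RNA letters."""
    return set(seq) <= set("ACGU")


def locate(fragment, substrate):
    """Return the 1-based start of the first match, or None if absent."""
    index = substrate.find(fragment)
    return None if index < 0 else index + 1


if __name__ == "__main__":
    print("PhNlong90 length:", len(PHN_LONG90))
    for name, loop in HAIRPIN_LOOPS.items():
        print(name, len(loop), "nt, starts at", locate(loop, PHN_LONG90))
```

Because both loop sequences are substrings of the 90-nucleotide substrate, the hairpin constructs present the same sequence context around the cleavage site as the linear transcript.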

Production and purification of recombinant proteins 

Sample preparation for biochemical analysis. Proteins were
expressed in 2YT medium for 3 h at 37°C. For purification,
cells expressing PhNob1 were lysed in a buffer containing
50 mM Tris-HCl pH 7.5, 100 mM NaCl, 0.1 mM
ZnCl2 and 0.1 mM PMSF, and nucleic acids were
precipitated with 0.1% polyethylenimine pH 7.0. The
supernatant was loaded onto SP-Sepharose and PhNob1 was
eluted with a linear salt gradient from 100 to 1000 mM
NaCl. Peak fractions were collected, pooled and buffer
exchanged to 50 mM Bis-Tris pH 6.2, 50 mM KCl for
NMR or biochemical assays. PhNob1 PIN or PhNob1 Zn
were purified using NiNTA (Qiagen) followed by TEV
cleavage using GST-tagged TEV protease and removal
of tags and TEV protease using Glutathione-Sepharose
and NiNTA or size exclusion on a Sephacryl S-100
column in 50 mM Tris pH 7.5, 120 mM NaCl for
PhNob1 PIN and PhNob1 Zn, respectively.

NMR sample preparation. For the expression of 15N-
and/or 15N- and 13C-labeled PhNob1, PhNob1 PIN or
PhNob1 Zn, M9 medium supplemented with 1 g/l 15NH4Cl
and 2.5 g/l 13C6-D-glucose as sole nitrogen and carbon
source, respectively, was used. 2H-, 13C- and 15N-labeling
of PhNob1 PIN was achieved using E. coli-OD2
CDN medium (Silantes GmbH) and expression for 6 h.
Proteins were purified as above with subsequent gel filtration
on a HighPrep Sephacryl S-100 column on an Akta
purifier system (GE-Healthcare). NMR samples were
prepared in 50 mM Bis-Tris, 50 mM KCl, pH 6.2, 10%
(v/v) heavy water (D2O) at protein concentrations of 300-400 µM
(15N- and 13C-labeled PhNob1 and PhNob1 Zn and
2H-, 13C- and 15N-labeled PhNob1 PIN) and 100-150 µM
(15N-labeled proteins for titration experiments), respectively.
For titration, PhNlong23 (GAU CAC CUC CUA
UCG CCG GAA AC) and PhNshort10 (CCU CCU AUC
G) were used as substrates.

Endonuclease assays 

Pre-rRNA fragments were transcribed in vitro using T7 
polymerase (Fermentas) and labeled by incorporation of 
radiolabeled nucleotides or subsequent 5'-labeling. In vitro 
nuclease assays were performed as described (13). 
Escherichia coli lysates for RNA degradation controls 
were prepared as described for PhNob1 purification and
used in different dilutions as indicated. Cleavage products 
were separated on an 8% PAA-8 M urea sequencing gel and
analyzed by autoradiography. Sequencing ladders were 
produced using a T7 sequencing kit (USB). 

Structure determination 

NMR-spectroscopy and resonance assignments. NMR- 
spectra were recorded at 37°C on Bruker AVANCE 600, 
700, 800 and 900 MHz spectrometers equipped with cryo- 
genic triple resonance probes. Chemical shifts were 
referenced to an external trimethylsilylpropionate standard 
dissolved in NMR-buffer. The standard set of triple res- 
onance experiments was used for the assignment of 
PhNob1 backbone and side chain resonances as described
elsewhere (48). In addition, the assignments for the
backbone amide groups of lysines were verified using
15N-heteronuclear single quantum correlation (HSQC)
spectra of an amino acid selectively labeled PhNob1
sample prepared as described (49). A {1H},15N-heteronuclear
nuclear Overhauser effect (HetNOE)
experiment (50) for 1H- and 15N-labeled PhNob1 was
recorded using a Bruker standard pulse sequence at
37°C with a recycle delay of 5 s and 32 scans per increment.
All spectra were processed using Bruker TopSpin
2.1 and analyzed using the program CARA (51). 



Structural constraints. Torsion angle constraints for the
backbone torsion angles φ and ψ were generated by
using TALOS+ for the analysis of backbone H, N, Cα,
Cβ and CO chemical shifts (52) and used for all residues
with {1H},15N-HetNOE values >0.5. A long range
2D-H(N)CO experiment with the NC-transfer delay set
to 133 ms and with 128 scans per increment and 96
complex points in the indirect CO-dimension was
recorded for the detection of hydrogen bonds in
PhNob1 Zn using 13C- and 15N-labeled protein. Similarly,
a standard TROSY-3D-HNCO experiment and a long
range TROSY-3D-HNCO (53) with the NC-transfer
delay set to 133 ms (32 scans per increment, 32 and 48
complex points in the indirect 15N and 13C dimensions,
respectively) were recorded for identifying hydrogen
bonds in PhNob1 PIN using 2H-, 13C- and 15N-labeled
protein. In both experiments, appropriate deuterium
decoupling was used. Hydrogen bonds detected in these
long range HNCO experiments on the isolated domains
were converted into upper limit distance restraints with an
H(N)-CO distance of 2.3 Å and an N-CO distance of 3.2 Å
when the chemical shifts of amide groups involved were
identical in the full-length protein and the isolated domain.

Nuclear Overhauser effect (NOE)-based distance restraints
were derived from a 3D-15N-edited nuclear
Overhauser effect spectroscopy (NOESY)-HSQC experiment,
3D-13C-edited NOESY-HSQC experiments with
the 13C offset and the transfer delays optimized for either
the aliphatic or the aromatic carbons recorded in H2O and
a 3D-13C-edited HSQC-NOESY experiment in D2O.

Structure calculations. The TALOS+-derived torsion 
angle constraints, the hydrogen bond constraints and the 
3D-NOESY peak lists served as input for seven rounds of 
iterative NOE assignments and structure calculations 
using the ATNOS/CANDID module in UNIO (54,55) in 
combination with CYANA (56). This already resulted in a 
well converged bundle of structures with low target func- 
tions. The automatically obtained NOE-derived distance 
constraints were visually inspected and manually cor- 
rected in case of obvious artifacts or misassignments. 
Additional hydrogen bond constraints were introduced 
for amide protons which did not exchange against D2O
after 24 h in an H/D-exchange experiment and which 
formed hydrogen bonds in more than 15 out of the 20 
initial UNIO structures with the lowest target function. 
The resulting edited NOE-derived distance constraints 
were then used together with the hydrogen bond and 
torsion angle constraints described above as input for a 
final round of structure calculation using constrained 
torsion angle molecular dynamics with CYANA (56). 

The zinc atom was introduced in the final structure calculations
via a covalent bond to the sulfur atom of C150
with a bond length of 2.3 Å and a bond angle (Zn-S-Cβ) of
96°. Distance constraints with upper and lower limits of
4.1 and 3.6 Å between the sulfur atoms of the four cysteine
residues (C131, C134, C147 and C150) and 2.6 and 2.1 Å
between the zinc atom and the sulfur atoms of the cysteine
residues were used to enforce the tetrahedral coordination
of the zinc atom (57). The 20 structures with the lowest
target function out of the 200 structures calculated in this
final CYANA run were chosen to represent the solution
structure of PhNob1 and minimized with the AMBER
force field as implemented in OPAL (58). All figures of 
structures were prepared using MOLMOL (59) or 
UCSF-Chimera (60). 
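
The zinc restraints quoted above are mutually consistent with ideal tetrahedral geometry, which can be checked with a short calculation. The sketch below is an illustration rather than the procedure used for the structure calculation. It assumes an ideal S-Zn-S angle of arccos(-1/3), about 109.47°. The count of 18 limits is likewise an inferred accounting (upper and lower limits for six S-S pairs plus the three Zn-S pairs not already fixed by the covalent bond to C150) that is not spelled out in the text.

```python
import math
from itertools import combinations

ZN_S_BOND = 2.3            # Å, Zn-S bond length given in the text
S_S_LIMITS = (3.6, 4.1)    # Å, lower and upper S-S limits given in the text
CYSTEINES = ["C131", "C134", "C147", "C150"]
COVALENT_LIGAND = "C150"   # bonded to zinc explicitly, per the text


def ideal_s_s_distance(bond_length):
    """S-S distance for two ligands at the ideal tetrahedral angle."""
    angle = math.acos(-1.0 / 3.0)  # about 109.47 degrees, in radians
    return 2.0 * bond_length * math.sin(angle / 2.0)


def count_distance_limits():
    """Assumed accounting: one upper and one lower limit per restrained pair."""
    s_s_pairs = list(combinations(CYSTEINES, 2))
    zn_s_pairs = [c for c in CYSTEINES if c != COVALENT_LIGAND]
    return 2 * len(s_s_pairs) + 2 * len(zn_s_pairs)


if __name__ == "__main__":
    distance = ideal_s_s_distance(ZN_S_BOND)
    print(f"ideal S-S distance: {distance:.2f} Å")
    print("within limits:", S_S_LIMITS[0] <= distance <= S_S_LIMITS[1])
    print("distance limits counted:", count_distance_limits())
```

With a 2.3 Å Zn-S bond the ideal S-S separation is about 3.76 Å, which falls inside the 3.6-4.1 Å window used for the sulfur-sulfur restraints.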

Structure comparison 

Structures of PIN (scop:88723) and zinc ribbon domains 
(scop:57783 and scop:144206) were identified by their
SCOP (61) classification and downloaded from the protein 
data bank (PDB) (www.pdb.org) (62). Structures were 
loaded in YASARA (www.yasara.org) and superimposed 
with its MUSTANG plugin (63) onto the domains of 
PhNob1 to calculate the root mean square deviation
(RMSD) and sequence similarity.

Electrophoretic mobility shift assay 

Wild type PhNob1 was incubated with radiolabeled
helix 40, PhNlong23 or PhNlong90 RNA for 30 min
in 50 mM Tris/HCl pH 7.5, 100 mM NaCl and 1 mM
β-mercaptoethanol at 25°C. Complexes were separated on
6% native PAGE in 1x TBE (89 mM Tris, 89 mM boric
acid and 2 mM ethylenediaminetetraacetic acid) and
analyzed by phosphorimager.

Analytical gel filtration 

Analytical gel filtration experiments for the detection of 
RNA-protein interactions were carried out using a 
Superdex S75 10/30 GL (GE Healthcare) gel filtration 
column with a flow rate of 1 ml/min on an Akta purifier 
system (GE-Healthcare) in NMR buffer. As a control, a
50 µM sample of the pre-16S rRNA helix 40 mimic of
P. horikoshii (H40) was applied to the column. Before
loading, the RNA was heated for 10 min and immediately
cooled down on ice. For RNA-protein interactions,
protein samples of PhNob1 and PhNob1 Zn were added
after the heating step. In all cases, the total applied
volume was 100 µl. Elution was followed at λ1 = 280 nm
and λ2 = 260 nm. Protein standards and PhNob1 and
PhNob1 Zn samples without RNA were applied to the
system in a similar way without any heating step.

Anisotropy measurements 

Fluorescence anisotropy measurements with 5'-fluorescein
labeled helix 40 (5'-Fl-GACUGCCGGCGAUAAGCCGGAGGAAG-3';
Dharmacon), the shortened 5'-fluorescein
labeled helix 40 (H40S 5'-Fl-CCGGCGAUAAGCCGG-3';
Dharmacon) or a stabilized H16 from S. cerevisiae
18S rRNA, where all mismatches were changed
into Watson-Crick base pairs (ScH16 5'-Fl-UACAGGGCCCAUUGGGGCCCUGUA-3';
Dharmacon) were
carried out at 25°C using a Fluorolog 3 (Horiba Jobin
Yvon, Germany) spectrometer equipped with polarizers.
Full-length PhNob1, PhNob1 PIN and PhNob1 Zn were
titrated to 100 nM fluorescein labeled RNA in 100 mM
NaCl, 25 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic
acid (HEPES), pH 7.0. Excitation and
emission wavelengths were set to 490 and 518 nm, respectively.
Binding curves for full-length PhNob1 and
PhNob1 Zn were analyzed in Sigma Plot (Systat Software
Inc.) by non-linear least square fitting to the equation:
R = R_0 + R_1 c/(c + K_D) (R: anisotropy, R_0: anisotropy of
free RNA, R_1: increase in anisotropy due to binding,
c: protein concentration, K_D: dissociation constant) considering
the large excess of protein.
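
The single-site binding model used for the anisotropy data can be fitted without specialized software. The following is a minimal sketch under stated assumptions, not the Sigma Plot procedure used in this study. It scans a grid of candidate dissociation constants and, for each one, solves for the two remaining parameters by linear regression, since the model is linear in the free-RNA anisotropy and the amplitude once the dissociation constant is fixed. The titration points and parameter values are hypothetical and noise-free, and the model neglects depletion of free protein, as the equation above does.

```python
def anisotropy(c, r0, r1, kd):
    """Binding model from the text: R = R0 + R1 * c / (c + KD)."""
    return r0 + r1 * c / (c + kd)


def fit_binding_curve(conc, aniso, kd_grid):
    """Least-squares fit: scan KD, solve R0 and R1 by linear regression."""
    n = len(conc)
    mean_y = sum(aniso) / n
    best = None
    for kd in kd_grid:
        # For a fixed KD the model is a straight line in x = c / (c + KD).
        x = [c / (c + kd) for c in conc]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aniso))
        r1 = sxy / sxx
        r0 = mean_y - r1 * mean_x
        sse = sum((anisotropy(c, r0, r1, kd) - y) ** 2
                  for c, y in zip(conc, aniso))
        if best is None or sse < best[0]:
            best = (sse, r0, r1, kd)
    _, r0, r1, kd = best
    return r0, r1, kd


# Hypothetical titration (protein concentrations in µM) with made-up parameters.
CONC_UM = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
TRUE_R0, TRUE_R1, TRUE_KD = 0.05, 0.10, 1.5
DATA = [anisotropy(c, TRUE_R0, TRUE_R1, TRUE_KD) for c in CONC_UM]
KD_GRID = [0.05 * i for i in range(1, 401)]  # 0.05 to 20 µM

FIT_R0, FIT_R1, FIT_KD = fit_binding_curve(CONC_UM, DATA, KD_GRID)

if __name__ == "__main__":
    print(f"R0 = {FIT_R0:.3f}, R1 = {FIT_R1:.3f}, KD = {FIT_KD:.2f} µM")
```

The grid scan trades precision for simplicity: the resolution of the fitted dissociation constant is limited by the grid spacing, so a finer grid or a dedicated optimizer would be needed for real data.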

RESULTS 

Identification of Nob1 homologs in archaea

Saccharomyces cerevisiae Nob1 has previously been
shown to mediate processing of the 20S pre-rRNA at
site D to generate the mature 18S rRNA (13).
Remarkably, orthologous sequences of Nob1 can be
identified in mammals, fungi, plants, protists and in
many archaea (Figure 1A; Supplementary Figure S1 and
Supplementary Table S1) suggesting that the RNA
cleavage step catalyzed by this enzyme is largely
conserved. We found Nob1 homologs in all of the major
branches of archaea with the exception of the
Korarchaeota branch, and even in Nanoarchaeum
equitans, which is known to contain a highly compressed
genome due to its parasitic lifestyle (67,68). In total, 82%
of the sequenced archaeal genomes available in GenBank
contained a Nob1-like protein as identified by a significant
hit in a hidden Markov model search. The absence of
Nob1 homologs in Korarchaeota, however, might be a
result of the small number of genomes available for this
phylum.

All identified Nob1 homologs apparently contain a
putative PIN domain with highly conserved aspartate
residues that are thought to be required for the
endonucleolytic cleavage activity and a putative zinc
ribbon domain with four highly conserved cysteine
residues (Figure 1B). However, the archaeal and eukaryotic
sequences vary significantly with respect to their
lengths. All eukaryotic sequences contain a large insertion
in the putative PIN domain and an additional long
C-terminal tail of unknown structure and function,
which are lacking in the archaeal proteins. To confirm
the functional conservation, we analyzed an archaeal
homolog. After testing a number of archaeal Nob1
homologs for expression yields, we chose Nob1 from the
thermophilic archaeon P. horikoshii (PhNob1) for further
studies. PhNob1 exhibits a sequence similarity of 68 and
66% to Nob1 from S. cerevisiae and Homo sapiens in the
aligned regions, respectively. Thus, the P. horikoshii
protein was expressed and purified without any additional
tag for in vitro characterization and structure determination
(Figure 1C and D).

PhNob1 binds Zn2+ via its zinc ribbon domain

Based on the presence of four conserved cysteine residues
in their putative C-terminal Zn-ribbon domain, it
was proposed that all Nob1 proteins bind a Zn2+ ion,
but Zn2+ binding has not been demonstrated experimentally
so far. Therefore, the zinc content in the different
purified proteins (Figure 1C and D) was determined to
explore whether the Zn-ribbon domain coordinates
this metal. Full-length PhNob1, the Zn-ribbon domain
(PhNob1 Zn) and the PIN domain (PhNob1 PIN) were

[Figure 1: (A) phylogenetic tree with the branches Crenarchaeota, Euryarchaeota and Thaumarchaeota; (B) sequence alignment of PhNob1, HsNob1, ScNob1, AtNob1 and DpNob1; (C) domain annotation and PhNob1 constructs; (D) gel of the purified proteins.]

Figure 1. PhNob1 sequence analysis and recombinant protein production. (A) The phylogenetic relation of Nob1 was calculated with RAxML v7.2.6
(64) as described (Supplementary Figure S1 and Supplementary Table S1). Dashed lines indicate low bootstrap values (<50%). The tree was
prepared with MEGA (65) and filled triangles represent compressed subtrees. The triangle length and width are defined by the longest branch in
a subtree and by the number of sequences, respectively. (B) The alignment of the amino acid sequence of the Nob1 protein from P. horikoshii
(PhNob1, amino acids 1-161), H. sapiens (HsNob1, amino acids 1-412), Arabidopsis thaliana (AtNob1, amino acids 1-602), Dictyostelium purpureum
(DpNob1, amino acids 1-429) and S. cerevisiae (ScNob1, amino acids 1-459) is shown. (C) The nomenclature, Nob1 domain annotation and
PhNob1 constructs used in the manuscript are shown as bar diagrams. (D) The purified proteins used for biochemical characterization and structure
determination were subjected to Tricine-PAGE (66) followed by Coomassie blue staining.



purified and the zinc content analyzed by inductively
coupled plasma mass spectrometry. A molar ratio of
about 1 (protein to zinc) for PhNob1 and for PhNob1 Zn
(Table 1) was observed. In contrast, the ratio for
PhNob1 PIN corresponded to an about 60-fold protein excess
over zinc (Table 1). These results clearly indicate that zinc
is copurified with PhNob1 and the zinc-binding domain,
as the enzyme was not recharged with zinc before
the analysis. The high affinity of PhNob1 for zinc is
consistent with our observation that bound zinc cannot
be extracted in vitro by either EDTA or
N,N,N',N'-tetrakis-(2-pyridylmethyl)ethylenediamine (TPEN),
which have reported affinities for Zn 2+ of 10~ 4 -10~ 6 
M, respectively (69). 
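
The molar ratios reported in Table 1 follow directly from the measured mass concentrations. The sketch below reproduces them as a worked example and is not the authors' analysis script. It assumes that the protein concentrations are given in µg/µl and the zinc concentrations in ng/µl, a reading of the table header that reproduces the published ratios, and it uses a zinc molar mass of 65.38 g/mol. The function and variable names are illustrative.

```python
ZN_MOLAR_MASS = 65.38  # g/mol

# (MW in kDa, protein in µg/µl, zinc in ng/µl, reported protein:zinc ratio)
TABLE_1 = {
    "PhNob1": (17.9, 3.5, 12.4, 1.02),
    "PhNob1 Zn": (4.9, 0.05, 0.74, 0.90),
    "PhNob1 PIN": (13.8, 1.4, 0.11, 59.9),
}


def protein_to_zinc_ratio(mw_kda, protein_ug_per_ul, zinc_ng_per_ul):
    """Molar protein:zinc ratio from mass concentrations."""
    # 1 µg/µl equals 1 g/l, so dividing by g/mol gives mol/l.
    protein_molar = protein_ug_per_ul / (mw_kda * 1000.0)
    # 1 ng/µl equals 1 mg/l, i.e. 1e-3 g/l.
    zinc_molar = zinc_ng_per_ul * 1e-3 / ZN_MOLAR_MASS
    return protein_molar / zinc_molar


RATIOS = {
    name: protein_to_zinc_ratio(mw, protein, zinc)
    for name, (mw, protein, zinc, _reported) in TABLE_1.items()
}

if __name__ == "__main__":
    for name, (_mw, _protein, _zinc, reported) in TABLE_1.items():
        print(f"{name}: calculated {RATIOS[name]:.2f}, reported {reported}")
```

Small differences between calculated and reported values are expected from rounding of the tabulated concentrations and molecular weights.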

Endonucleolytic activity of PhNob1

Purified PhNob1 (Figure 1) was tested in pre-rRNA
cleavage assays using an in vitro transcript containing a 
pre-rRNA sequence covering the 3'-end of P. horikoshii 
SSU rRNA (16S rRNA), thus resembling site D of 

Table 1. Zinc determination

Protein       MW (kDa)   Protein (µg/µl)   Zinc (ng/µl)   Protein:Zinc molar ratio
PhNob1        17.9       3.5               12.4           1.02
PhNob1 Zn     4.9        0.05              0.74           0.90
PhNob1 PIN    13.8       1.4               0.11           59.9

pre-16S-rRNA. Cleavage of the internally labeled transcript
resulted in two products, which occurred in a
PhNob1 concentration-dependent manner (Figure 2A,
internal). To determine which of them represents the
5'-terminal fragment, 5'-end labeled RNA was used, resulting
in the detection of only the longer cleavage
product, again in a PhNob1 concentration-dependent
manner (Figure 2A, 5'-labeled). This represents the
5'-fragment expected to result from cleavage at the
3'-end of the mature 16S rRNA. This was further confirmed
by sequencing of PhNob1 cleavage products, the
majority of which were processed at the position previously
reported to represent the 16S rRNA 3'-end
(12,13,20,26). The cleavage of the pre-rRNA is strictly
dependent on the divalent cation manganese, which
cannot be substituted by Mg2+ ions (Figure 2B). This
finding parallels the observations previously made with
yeast Nob1 (13). The observed nuclease activity is not
the result of contamination by E. coli RNases,
because it is observed even after an extended heat shock
(10 min, 75°C), which does not affect the activity of the
thermophile protein (Supplementary Figure S3). In
addition, incubation of substrate RNAs with E. coli
lysates resulted in degradation patterns that did not
resemble the specific cleavage products observed after
incubation with purified PhNob1 (Supplementary
Figure S3).

Specific hairpin constructs were designed to analyze 
whether the fragments obtained are generated by endo- 
nuclease activity or exonucleolytic digestion (Figure 3A) 
(13). These constructs carry sequences of the cleavage site 
in their loop structure and the ends are protected by a 
highly stable secondary structure. Incubation of the 
in vitro transcribed hairpin RNAs with recombinant 
PhNob1 resulted in distinct bands for both RNAs of
sizes expected to result from endonucleolytic cleavage at 
the predicted site (Figure 3B). Thus, we conclude that 
P. horikoshii Nob1 is an endonuclease that can process
pre-rRNA sequences by cleavage at the position corres- 
ponding to the 3'-end of the 16S rRNA. 

PhNob1 consists of two structurally independent domains

To obtain structural insights, we performed NMR experiments.
First, we assigned the NMR signals of full-length
PhNob1. Isotopically labeled PhNob1 was expressed
and purified and yielded high-quality 1H,15N-HSQC
spectra containing the expected number of backbone
amide signals (Figure 4A). Essentially complete NMR assignments
were obtained for the backbone and side chain
Figure 2. Processing of the 3'-end of P. horikoshii 16S rRNA is
mediated by PhNob1. (A) In vitro transcripts of sequences from the
3'-end of P. horikoshii 16S rRNA and part of the internal transcribed
spacer 1 (ITS1) were radiolabeled internally (left) or 5'-labeled (middle)
and subjected to different concentrations of recombinant PhNob1.
Cleavage of RNAs was then analyzed by denaturing SDS-PAGE
followed by phosphorimaging. A sequencing ladder was prepared for
size comparison (right). Arrows indicate the long (arrow 1) and short
(arrow 2) cleavage products. Arrow 3 indicates the 3'-end of the 16S
rRNA. Cleavage of internally labeled substrate results in two labeled
products (arrows 1 and 2), while the 5'-phosphorylated substrate only
results in labeling of the long cleavage product (arrow 1). T represents
transcript subjected to electrophoresis without any previous treatment.
(B) Recombinant PhNob1 was incubated with in vitro transcribed
cleavage substrate as described for (A). Reactions were performed in
the presence (5 mM) or absence (-) of manganese ions.



resonances of full-length PhNob1 (deposited in
BioMagResBank (BMRB), accession number 17595)
(48). Chemical shift-based prediction of secondary structure
elements using the programs PECAN (70) and
TALOS+ (52) revealed an alternating pattern of five
β-strands and six α-helices in the first 120 residues of
PhNob1 and the presence of 2-3 short β-strands in the
C-terminal part of the protein (48). The alternating
pattern of five β-strands and six α-helices in the
N-terminal portion of PhNob1 is in agreement with the
presence of a PIN domain in PhNob1 as predicted based
on sequence homologies. The Cα and Cβ chemical shifts
of the four conserved cysteine residues in the C-terminal
part of PhNob1 indicate that these cysteines are in their
reduced form but are compatible with the complexation of
a Zn2+ ion (71) by these residues. Assignments for the
backbone amide groups of PhNob1 Zn and PhNob1 PIN
were obtained by transferring the assignment from
PhNob1 due to the close similarity of their 1H,15N-HSQC
spectra.
Figure 3. PhNob1 exhibits endonucleolytic activity. (A) Secondary
structure prediction of the hairpin constructs used. (B) Radiolabeled 
in vitro transcripts of long (HP/L) or short (HP/S) hairpin constructs 
with loops containing sequences of the 3'-end of 16S rRNA were 
incubated with PhNobl to analyze endonuclease activity. Analysis 
was performed as described for Figure 2. 



A {1H},15N-heteronuclear NOE (hetNOE) experiment demonstrated increased
flexibility of amino acids 120-125 (Figure 4B), in agreement
with the presence of a flexible linker between the
N-terminal and the C-terminal domain. The backbone
chemical shifts of the linker amino acids have random
coil values. The isolated N-terminal (amino acids 1-120)
and C-terminal fragments (amino acids 121-165) yielded
1H,15N-HSQC spectra typical for well-folded, globular
proteins. The overlay of the HSQC spectra for
these two domains and for the full-length PhNob1
showed that the HSQC spectrum of PhNob1 is essentially
the sum of the spectra of the two isolated domains (Figure
4A). This indicates that the structures of the PIN and the
zinc ribbon domains are highly similar in isolation and in
the context of the full-length protein. Thus, PhNob1
consists of two structurally independent domains
connected by a flexible linker (amino acids 119-126).
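
The hetNOE criterion used here (values below 0.5 taken as a sign of increased flexibility, as stated in the legend of Figure 4) can be illustrated with a short sketch. The function name `flexible_segments` and the demonstration values are hypothetical and are not measured data; the sketch only shows how consecutive residues below the cutoff could be grouped into candidate flexible segments.

```python
from typing import Dict, List, Tuple


def flexible_segments(hetnoe: Dict[int, float], cutoff: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive residues with hetNOE < cutoff into (first, last) segments."""
    segments: List[Tuple[int, int]] = []
    run: List[int] = []  # residue numbers of the segment currently being built
    for res in sorted(hetnoe):
        if hetnoe[res] < cutoff:
            # A gap in residue numbering ends the current segment.
            if run and res != run[-1] + 1:
                segments.append((run[0], run[-1]))
                run = []
            run.append(res)
        elif run:
            # A rigid residue ends the current segment.
            segments.append((run[0], run[-1]))
            run = []
    if run:
        segments.append((run[0], run[-1]))
    return segments


# Hypothetical toy profile: ten residues, three of them below the cutoff.
toy_profile = {i: 0.8 for i in range(1, 11)}
toy_profile.update({4: 0.3, 5: 0.2, 6: 0.4})
print(flexible_segments(toy_profile))
```

Applied to a real hetNOE profile, the flexible termini and any interdomain linker would be expected to appear as separate segments.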

Solution structure of PhNob1

The final solution structure of PhNob1 was calculated
using 228 torsion angle constraints, 18 distance constraints
to ensure the presence of a tetrahedrally
coordinated zinc ion (57) and 3542 nuclear Overhauser
effect spectroscopy (NOESY)-based distance constraints.
The direct analysis of the hydrogen bonding patterns in

Figure 4. PhNob1 consists of two structurally independent domains
connected by a flexible linker. (A) Overlay of 15N-HSQC spectra of
PhNob1 (black), PhNob1 PIN (red) and PhNob1 Zn (blue) with the
assignments for backbone amide groups of full-length PhNob1 indicated.
The two isolated domains exhibit virtually identical chemical shifts for
backbone amide resonances in comparison with the full-length protein.
(B) Plot of the measured {1H},15N-hetNOE values for full-length
PhNob1 against residue number. HetNOE values <0.5 point to an
increased flexibility of the respective residues. Besides residues at
the N- and C-termini, the amino acids 118-126 corresponding to
the predicted domain boundary between PIN and zinc ribbon domain
show strongly reduced hetNOE values. The position of the putative
flexible interdomain linker is indicated by the black arrow.



2D and 3D long-range HNCO experiments on the two
isolated domains, where signal overlap is reduced, yielded
a total of 34 hydrogen bonds (Figure 5A). The presence of
the same hydrogen bonds in the full-length protein was
verified by analyzing 3D-NOESY-HSQC and H/D-exchange
experiments (Figure 5B). The structure calculation
resulted in a set of converged structures with low
target functions and reasonable stereochemistry even in
the absence of the hydrogen bonding constraints derived
from the HNCO experiments. However, hydrogen
bonding constraints were included in the final rounds of
structure calculations (Table 2; the 20 best structures are
deposited in the PDB under accession code 2LCQ).

In agreement with the presence of a flexible 
interdomain linker and the structural independence 
of the two domains, no NOEs were found between 

Figure 5. Solution structure of PhNob1. (A) Overlay of a long-range 2D-H(N)CO spectrum (red) and a standard 2D-H(N)CO spectrum (black)
recorded for the isolated zinc ribbon domain for direct identification of hydrogen bond donor and acceptor groups. Identified hydrogen bonds are
indicated. (B) Secondary structure and hydrogen bonding scheme for the zinc ribbon domain as derived by the long-range H(N)CO experiment.
Directly identified hydrogen bonds are indicated by bold dashed lines between donor hydrogen and acceptor CO group. Hydrogen bonds identified in
the isolated domain are present in full-length PhNob1 as indicated by the corresponding NH-NH and NH-Hα NOEs (double-edged grey arrows)
and the observed protection patterns in an H/D-exchange experiment. Positions of slowly exchanging NH protons in the full-length protein are
marked by orange (NH present after 6 h) or red (NH present after 24 h) dots. Fast exchanging NH protons are colored in light blue. (C) Bundle of
the 20 structures with the lowest target function aligned using the backbone heavy atoms of residues 11-117 of the PIN domain. The PIN domain is
colored dark blue, the zinc ribbon domain is colored gold. (D) Overlay of the PIN domain (residues 11-117) taken from the 20 structures with the
lowest target function in cartoon representation. β-strands are colored magenta whereas α-helices are colored orange. (E) Overlay of the zinc ribbon
domain (residues 129-159) taken from the 20 structures with the lowest target function in cartoon representation. The secondary structure elements are
annotated. The position of the bound Zn2+ ion assuming tetrahedral coordination is indicated (blue sphere). The four metal-coordinating cysteine
residues are shown as sticks with their sulfur atoms colored in yellow.



amino acid residues of the two domains despite careful
manual checking. Accordingly, the orientation of
the two domains with respect to each other is not
defined by our NMR data (Figure 5C). However, the
structures of the two domains themselves are well
defined as judged by the backbone heavy atom root
mean square deviations (RMSDs) of 0.51 Å and 0.52 Å
for the PIN domain (residues 9-118) and the zinc ribbon
domain (residues 127-159), respectively (Figure 5D
and E).
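
For orientation, the following sketch shows how an ensemble RMSD of this kind can be computed. It is an illustration only: it assumes the models have already been superimposed on the residue range of interest (the fitting step is omitted), it reports the mean RMSD of each model to the average coordinates, and the function names are hypothetical. The software and exact convention behind the values quoted above are not stated in this section.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root mean square deviation between two equally long, pre-superimposed atom lists."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def mean_structure(models: Sequence[Sequence[Coord]]) -> List[Coord]:
    """Average the coordinates of each atom over all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        (
            sum(m[i][0] for m in models) / n_models,
            sum(m[i][1] for m in models) / n_models,
            sum(m[i][2] for m in models) / n_models,
        )
        for i in range(n_atoms)
    ]


def ensemble_rmsd(models: Sequence[Sequence[Coord]]) -> float:
    """Mean RMSD of every model to the average structure."""
    mean = mean_structure(models)
    return sum(rmsd(m, mean) for m in models) / len(models)
```

Because the two domains show no fixed relative orientation, such a calculation only gives meaningful values when it is carried out separately for each domain, as done for the RMSDs reported above.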

The structure of the PIN domain (Figure 5D) reveals a
central five-stranded β-sheet with all five strands in a
parallel orientation. The order of strands in this β-sheet
is β5↑-β4↑-β1↑-β2↑-β3↑. The loops connecting the
β-strands with each other include either one or two
α-helices of different lengths. Helix α1 is located between
strands β1 and β2 and helix α6 between β4 and β5, whereas
two α-helices are inserted between β2 and β3 and between β3 and
β4, respectively. Two sets of three α-helices each (α1-α3
and α4-α6, respectively) form a helical bundle on either
side of the five-stranded β-sheet. In each of these bundles,
one helix (α1 and α4, respectively) is packed perpendicular
against the other two, which are either oriented parallel
(α5 and α6) or antiparallel with respect to each other.

Table 2. NMR and refinement statistics for the 20 lowest energy
structures of PhNob1

NMR constraints
  Total NOE distance constraints               3542
    Intraresidue                               651
    Sequential                                 931
    Medium range                               639
    Long range                                 1321
  Hydrogen bond                                47
  Zinc coordination                            18
  Dihedral restraints (TALOS+)                 228
Structural statistics
  Target function value                        3.96 ± 0.44
  Distance constraints (Å)                     0.013 ± 0.001
  Maximum distance constraint violation (Å)    0.34 ± 0.09
  Dihedral angle constraints (°)               0.61 ± 0.08
  Maximum dihedral angle violation (°)         3.90 ± 0.96
  Amber energies total (kcal/mol)              -4750 ± 239
RMSD from idealized geometry
  Bond length (Å)                              0.0147 ± 0.0001
  Bond angles (°)                              1.82 ± 0.03
  Residues in most favored regions             84.9%
  Residues in additionally allowed regions     13.6%
  Residues in generously allowed regions       1.1%
  Residues in disallowed regions               0.5%
RMSD values (Å)
  Backbone (residues 9-118)                    0.51
  Backbone (residues 127-159)                  0.52
  Heavy atoms (residues 9-118)                 1.19
  Heavy atoms (residues 127-159)               1.18



The overall twist of the five-stranded parallel β-sheet is
approximately 160°. The five-stranded parallel β-sheet, the
strand order of the sheet and its decoration with
α-helices concur with the expected consensus structure of
PIN domains (Figure 5D) (22,72,73).

The second domain contains a three-stranded antiparallel
β-sheet (Figure 5E). The topology of this β-sheet can be
described as β3↑-β1↓-β2↑. The four zinc-coordinating
cysteine residues are located at the C-terminus of β1, in
the loop connecting β1 and β2 and in the long irregular
loop connecting β2 and β3, and are brought into close spatial
proximity even when the structure is calculated without
constraints between the sulfur atoms of the four cysteine
side chains and the putative zinc ion.

Mapping of the catalytic center and the substrate 
binding site of PhNob1

The endonucleolytic cleavage activity of PhNob1 is dependent
on the presence of divalent manganese ions
(Figure 2). Manganese ions are paramagnetic and induce
line broadening of NMR signals, leading to the disappearance
of amide signals of amino acids in the vicinity of
Mn2+-binding sites. To locate the active site of PhNob1 for
endonucleolytic cleavage, 1H,15N-HSQC spectra of
15N-labeled PhNob1 were recorded without and with
addition of up to 50 µM MnCl2. At a concentration of
10 µM Mn2+, signals for the amino acids D12, S13, S14,
V15, L78, K80, A81, I83, V85, L86, D99, D100, Y101 and
N102 could no longer be detected (an example is given in
Figure 6A). When mapped on the surface of the protein
3D structure, these residues are clustered on a defined
surface of the PIN domain of PhNob1, which is in agreement
with the presence of a single, well-defined
Mn2+-binding site (Figure 6B and C). Three of the
affected aspartates (D12, D82 and D100; Figure 6B) are
conserved in all archaeal and eukaryotic sequences of
Nob1 (Figure 1). The 3D structure of the domain brings
the aspartate side chains of D12 at the end of β1, D82 at
the N-terminus of α5 and D99 and D100, which follow β5,
into close spatial proximity. Two of these amino acids,
namely D12 and D82, were already implicated as necessary
for D-site cleavage activity by mutational studies (13,26)
in ScNob1 (scD15 and scD92). Notably, no line broadening
was observed for signals of amino acid residues in the
C-terminal zinc ribbon domain, which is apparently not
involved in Mn2+ binding.

To map amino acids possibly involved in substrate
binding, we titrated 15N-labeled PhNob1 with RNAs of
different lengths (10 and 23 nt; PhNshort10 and
PhNlong23, see 'Materials and Methods' section)
containing the D-site sequence as well as with single-stranded
oligo-U RNA (U9). We monitored the resulting
chemical shift changes in 1H,15N-HSQC spectra
(Figure 6B). Only limited and gradual chemical shift
changes were observed upon addition of increasing
amounts of all RNAs, either in the absence or presence
of Mg2+ as divalent cation. This is typical for the formation
of low-affinity RNA-protein complexes in fast
exchange on the NMR time scale. Notably, all affected
signals (S13, S14, K70, S79, D99, D100, V103, L109 and
R115-K118) correspond to amino acids in the N-terminal
PIN domain. No chemical shift changes were observed for
the C-terminal zinc ribbon domain despite its strongly
basic amino acid composition. Mapping of the affected
residues on the 3D structure of the PIN domain reveals
that they line the manganese binding pocket, as expected
for a genuine substrate binding site (Figure 6C).

The fact that U9 induces similar, albeit smaller, chemical
shift changes as the genuine substrates for endonucleolytic
cleavage indicates that substrate recognition by the PIN
domain itself is only marginally specific for short RNA
fragments. The low affinity for the short substrates
observed in the NMR experiments is in agreement with
the K_D estimated by electrophoretic mobility shift assays
for PhNlong23, which is larger than 30 µM
(Supplementary Figure S4). However, the longer substrate
used in the cleavage assays (PhNlong90) is bound with a
higher affinity, reflected by a dissociation constant of
about 0.7 µM estimated from electrophoretic mobility
shift assays (Supplementary Figure S4). The same holds
true for the isolated PIN domain, which binds with a
higher affinity, reflected by a lower dissociation constant
(K_D = 5 µM), to PhNlong90 than to PhNlong23
(K_D > 60 µM; Supplementary Figure S4). However, the
increase of the dissociation constant by a factor of 10
for PhNlong90 and by a factor of 2 for PhNlong23 upon
omitting the zinc ribbon domain suggests that the latter
contributes to the interaction in vitro, but to a smaller
extent for the shorter RNA. The RNA PhNlong90,
however, is unfortunately unsuitable for NMR titration
experiments due to its size.
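
The comparison of dissociation constants above can be made explicit with a small calculation. The sketch below converts a ratio of two K_D values into a difference in binding free energy via ΔΔG = RT ln(K_D,variant / K_D,reference). The temperature of 298.15 K is an assumption, since the assay temperature is not given in this section, and the function names are illustrative. With the estimates quoted above for PhNlong90 (about 0.7 µM for the full-length protein and 5 µM for the isolated PIN domain), the ratio is roughly 7, of the same order as the factor of 10 mentioned in the text. For PhNlong23 only lower bounds are available (>30 µM and >60 µM), so a ratio can only be given for those bounds.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fold_change(kd_reference_um: float, kd_variant_um: float) -> float:
    """Ratio of two dissociation constants (variant over reference)."""
    if kd_reference_um <= 0 or kd_variant_um <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_variant_um / kd_reference_um


def delta_delta_g(kd_reference_um: float, kd_variant_um: float,
                  temperature_k: float = 298.15) -> float:
    """Loss of binding free energy in kJ/mol; the default temperature is an assumption."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(
        fold_change(kd_reference_um, kd_variant_um)
    )


# Estimates quoted in the text for PhNlong90: full-length protein vs isolated PIN domain.
print(f"fold change: {fold_change(0.7, 5.0):.1f}")
print(f"ddG: {delta_delta_g(0.7, 5.0):.1f} kJ/mol")
```

A positive ΔΔG here means that removing the zinc ribbon domain weakens binding.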



3268 Nucleic Acids Research, 2012, Vol. 40, No. 7 



[Figure 6 graphic. Panels A and B: sections of 1H,15N-HSQC spectra, axes 1H (ppm) and 15N (ppm). Panel D lanes: PhNob1, D12N, S79A, D100N, R115A.]



Figure 6. Identification of the active center and the substrate binding site of PhNob1. (A) Overlay of a selected section of the 1H,15N-HSQC spectra
of PhNob1 in the absence (black) or presence (red) of 10 µM Mn2+ ions. The signal assignments for affected amino acids are given. (B) Overlays of
selected regions of 1H,15N-HSQC spectra of PhNob1 in the absence (black) and in the presence of 1.2 (red) or 2.4 (light blue) equivalents of
substrate RNA. Affected signals are indicated. (C) Mapping of the effects of Mn2+ or substrate RNA addition on the structure of the PIN domain.
Amino acids affected by Mn2+ are colored red, those affected by RNA are colored yellow and those affected by both are shown in orange. The three
indicated aspartates are shown in stick representation. D82 (asterisks) cannot be analyzed due to signal overlap in the HSQC spectrum, but both
adjacent amino acids are influenced by Mn2+. (D) Wild-type or mutant PhNob1 were recombinantly produced, purified and incubated with
radiolabeled in vitro transcripts as described for Figure 2. The expected migration of the cleavage products is indicated as in Figure 2.

Figure 7. The interaction of PhNob1 with helix 40. (A) Secondary
structure prediction of the 26-mer helix 40 RNA used for the binding
studies. (B) Binding of recombinant PhNob1 to helix 40 RNA was
probed by gel filtration analysis. The gel filtration profile
of helix 40 determined at 260 nm is shown in black; the corresponding
profiles of an approximately 1:1 and an approximately 1:4 mixture
of helix 40 with PhNob1 are shown in red and blue, respectively.
(C) PhNob1 was incubated at the indicated concentrations with 5'-labeled



The PIN domain is required for RNA cleavage 

To further test the functional relevance of the amino
acids identified as part of the catalytic center and the substrate
binding site (Figure 6C), some of these residues in
the PIN domain were mutagenized. It was previously
shown that the D15N mutation in yeast abolished Nob1
cleavage activity (13). The corresponding residue in
P. horikoshii Nob1 is the aspartate at position 12.
Mutagenesis of this residue (D12N) resulted in a
complete loss of the catalytic activity of PhNob1
(Figure 6D). The mutation of serine 79 (S79A), which
showed the largest chemical shift change upon substrate
addition, resulted in a significant reduction of cleavage,
which is in line with its proposed role in substrate
binding. When analyzing the activity of PhNob1
carrying a mutation of D100 to N or R115 to A
(PhNob1 D100N and PhNob1 R115A), we observed unspecific
degradation of the substrate. This is in agreement
with the observation that these residues are involved
in substrate binding (Figure 6), and the data suggest
that these residues play a role in the correct
positioning of the substrate with respect to the catalytic
center. This indeed explains the loss of PhNob1 specificity
for the conventional cleavage site. In summary,
the results obtained for the different mutants support
the substrate binding site mapped by NMR
spectroscopy.

The C-terminal zinc ribbon domain is sufficient for 
binding to helix 40 of the SSU RNA 

It was recently described that yeast Nob1 cross-links to
helix 40 of 20S pre-rRNA (21). However, the addition of
short substrate RNA did not induce chemical shift
changes in the C-terminal zinc ribbon domain although
this domain harbors the majority of basic amino acid
residues of PhNob1. Thus, we speculated that the
C-terminal domain might be important for binding to
other rRNA elements apart from the D-site and might
be involved in anchoring PhNob1 in the nascent ribosomal
subunit. To substantiate this notion, we used a chemically
synthesized helix 40 RNA from P. horikoshii
(Figure 7A) to analyze the ability of the heterologously
produced protein to interact with RNA in vitro. Using
analytical gel filtration, we observed a decrease in the retention
time of helix 40 RNA (Figure 7B; black line) in the



Figure 7. Continued 

synthetic P. horikoshii helix 40 rRNA and binding was analyzed by
detection of the electrophoretic mobility shift on a native polyacrylamide
gel. (D) Results of the fluorescence anisotropy measurements
for binding of full-length PhNob1 (blue), the PIN domain (PhNob1 PIN, green)
and the Zn ribbon domain (PhNob1 Zn, red) to H40 from
P. horikoshii (square), a variant of PhH40 containing a 5-bp stem-loop
(circle) and the stabilized H16 from S. cerevisiae 18S RNA (diamond).
(E) The chemical shift perturbations caused by 1.2 equivalents
of helix 40 on the 15N-HSQC spectrum of the isolated zinc ribbon
domain were recorded; the spectra of the protein in the absence
(black) or presence (red) of RNA are shown. The amino acid residues
with the largest chemical shift changes upon RNA binding are labeled.
(F) Residues labeled in (E) are colored red on the structural cartoon of
the isolated zinc ribbon domain.

Table 3. Dissociation constants determined by anisotropy
measurements

Protein        RNA           K_D [µM]
PhNob1         PhH40         2.0 ± 0.1
PhNob1 Zn      PhH40         2.3 ± 0.1
               PhH40short    3.1 ± 0.2
               ScH16         1.5 ± 0.2
PhNob1 PIN     PhH40         >1000



presence of one (red) or four (blue) equivalents of
full-length PhNob1. In contrast, oligo-U (U9) shows no
shift in its retention time in similar experiments. This is
consistent with the concentration-dependent formation of
a helix 40 RNA/full-length PhNob1 complex observed by
gel-shift analysis (Figure 7C).

To analyze whether the zinc ribbon domain alone is
sufficient to mediate the interaction with helix 40 that
was observed in the gel shift and gel filtration experiments
with the full-length protein, the interaction of the helix
40 RNA (PhH40) with the isolated domains and the
full-length protein was analyzed by fluorescence anisotropy
measurements using 5'-fluorescein-labeled RNA
(Figure 7D). For full-length PhNob1, we observed a K_D
value of 2.0 ± 0.1 µM (Table 3). While essentially no interaction
was observed between PhH40 and PhNob1 PIN, the
dissociation constant for the isolated zinc ribbon domain
is similar to that of the full-length protein (Figure 7D top,
Table 3). The dissociation constant observed for the zinc
ribbon domain is sensitive to stem length, as it
increased slightly from 2.3 to 3.1 µM when
the 9 bp stem of PhH40 (Figure 7A) was shortened to the
5 bp stem of PhH40short (Figure 7D, bottom, square versus
circle). In turn, the dissociation constant observed for the
interaction of the zinc ribbon domain and the stabilized
H16 from S. cerevisiae containing 10 bp is 1.5 µM, which is
moderately lower than for PhH40 (Table 3 and Figure 7D,
bottom, diamond).
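
How a dissociation constant relates to an anisotropy titration curve can be sketched with a simple 1:1 binding model. This is a generic illustration and not necessarily the fitting procedure used here: it assumes a single binding site, accounts for depletion of free protein by the labeled RNA, and assumes that the observed anisotropy is a linear combination of the free and bound values. The RNA concentration and the anisotropy values of free and bound RNA in the example are hypothetical placeholders; only the K_D values are taken from Table 3.

```python
import math


def fraction_bound(protein_um: float, rna_um: float, kd_um: float) -> float:
    """Fraction of labeled RNA in complex for a 1:1 model (exact quadratic solution)."""
    if rna_um <= 0 or kd_um <= 0 or protein_um < 0:
        raise ValueError("concentrations and K_D must be positive (protein may be zero)")
    total = protein_um + rna_um + kd_um
    complex_um = (total - math.sqrt(total * total - 4.0 * protein_um * rna_um)) / 2.0
    return complex_um / rna_um


def anisotropy(protein_um: float, rna_um: float, kd_um: float,
               r_free: float, r_bound: float) -> float:
    """Observed anisotropy as a population-weighted average of free and bound RNA."""
    bound = fraction_bound(protein_um, rna_um, kd_um)
    return r_free + (r_bound - r_free) * bound


# K_D values from Table 3 for the zinc ribbon domain; all other numbers are hypothetical.
for label, kd in (("PhH40", 2.3), ("PhH40short", 3.1)):
    curve = [anisotropy(p, 0.05, kd, r_free=0.05, r_bound=0.15) for p in (0.5, 2.0, 8.0)]
    print(label, [round(r, 3) for r in curve])
```

In such a model, half-saturation is reached when the free protein concentration equals K_D, which is why a modest change from 2.3 to 3.1 µM shifts the titration curve only slightly.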

The interaction between the isolated zinc ribbon
domain and PhH40 was additionally analyzed by titration
with the RNA and monitoring the chemical shift changes in
the 1H,15N-HSQC spectra. Changes in the NMR spectra
were only observed until an RNA-protein ratio of
1:1 was reached (Figure 7E). Furthermore, there were
no gradual chemical shift changes; instead, new signals
appeared in the spectra upon addition of the RNA,
whereas signals belonging to the free protein disappeared.
Thus, the zinc ribbon domain forms a specific complex
with PhH40 in slow exchange on the NMR time scale,
indicative of a K_D for the RNA-protein interaction of
approximately <1 µM, which is in agreement with the
dissociation constant measured by fluorescence anisotropy
(Figure 7D). The amino acids affected by PhH40
addition include many of the basic residues of the zinc
ribbon domain, parts of the interdomain linker and the
C-terminal portion of the domain; they form a continuous
surface on the 3D structure of the domain located at the
flat side of the β-sheet (Figure 7F).



DISCUSSION 

D-cleavage of the SSU rRNA is conserved

Yeast Nob1 has been shown to mediate the final processing
of the SSU rRNA in the nascent small ribosomal
subunit by endonucleolytic cleavage of the 20S pre-rRNA.
Remarkably, this enzyme is phylogenetically conserved in
all eukaryotes and most archaea (Figure 1). All identified
Nob1 homologs contain a PIN and a zinc ribbon domain.
However, eukaryotic sequences contain an
extended PIN domain-intrinsic loop and a C-terminal extension
following the zinc ribbon domain (Figure 1). The
length, sequence and predicted secondary structure of
the PIN domain-intrinsic loop vary significantly between
the three different eukaryotic lineages (Supplementary
Figure S2). In all cases, the N-terminus and C-terminus
of the intrinsic loop contain a number of conserved acidic
and hydrophobic residues, suggesting that it might play a
role in additional interactions with basic ribosomal or
pre-ribosomal proteins not occurring or not required in
archaea. The C-terminal extension of the eukaryotic
Nob1 proteins contains a number of conserved charged
and predominantly basic residues as well as conserved
hydrophobic amino acids, consistent with a role of the extension
in RNA binding. The general lack of conservation
in the eukaryotic extension segments among different eukaryotic
lineages suggests that these extensions are
required either for very specialized functions of Nob1 or
for fine-tuning of Nob1's functionality to the specific environment
in different organisms. The archaeal Nob1 seems
to represent the minimal functional core required for its
function in ribosome biogenesis.

Despite its generally smaller size compared to its
eukaryotic counterparts, and to the biochemically
characterized yeast protein in particular, the P. horikoshii
Nob1 homolog cleaves RNA substrates containing
a sequence surrounding the D-site of P. horikoshii
SSU rRNA in an endonucleolytic manner specifically at
the predicted site (Figures 2 and 3). As reported for the
yeast protein, the reaction depends on the presence of
Mn2+ ions, but the in vitro cleavage reaction appears to
be much more efficient in comparison with the yeast
protein (Figure 2) (13,15), even when carried out at
37°C, which is not the optimal temperature for a thermophilic
enzyme. The presence of Nob1 homologs in archaea
and the observed cleavage specificity in vitro strongly
suggest that the 3' terminal maturation of the
pre-SSU-rRNA by processing at the D-cleavage site is
conserved.

The structural basis for Nob1 function

The three-dimensional structure of PhNob1 reveals a PIN
domain with a canonical structure and a zinc ribbon
domain with a three-stranded β-sheet (Figure 5). The
two domains are structurally independent from one
another and connected by a flexible linker, suggesting
that the orientation of the two domains with respect to
each other is flexible and can be influenced by interactions
with ribosomal RNA, ribosomal proteins or other
ribosome assembly factors.

Figure 8. Comparison of the determined structure with known PIN
and Zn-binding domains. (A) PIN and Zn ribbon domains from the
PDB were structurally aligned with the corresponding domains of
PhNob1 using the MUSTANG plugin in YASARA (www.yasara.org).
The output consists of the alignment from which the sequence
similarity and the Cα RMSD [Å] were calculated, which are presented
for the PIN domains (full circle) or the Zn ribbon domains (open
circle). (B) The overlay of the structures of PhNob1 PIN (gray) and
1o4w (secondary structures are colored) is shown as a cartoon
representation. (C) Representative structures of the deposited NMR
ensembles of PhNob1 Zn, 2con, 2klp (structurally alike 3g9y) and
1vd4 are shown with the same color code as in B. For all structures
the same orientation according to the structural alignment is shown.



The PIN domain is structurally most similar to the canonical
PIN domain of AF0591 from Archaeoglobus fulgidus,
which is annotated as succinyl-CoA synthetase β-chain
and does not contain a Zn ribbon (Figure 8A, PDB:1O4W) (74),
with a Cα RMSD of 1.53 Å and a sequence similarity of
38.5%. The β-sheets of both proteins are very similar,
although β3 is somewhat shorter in AF0591. With
respect to the overall fold, AF0591 contains an additional
helix between α1 and β2, but does not contain α4 present
in PhNob1 PIN (Figure 8B). Unfortunately, the function of
this factor is currently unknown and thus no functional
comparison of the two proteins is possible.

The insertion point of the eukaryote-specific loop in the
PIN domain would be located between α-helix α5 and
β-strand β4. Remarkably, the loop does not share
significant sequence similarities between fungi, mammals
or plants, although the region is generally enriched in
negatively charged amino acids. Even within these
groups only a moderate similarity between the sequences,
restricted to the N- and C-terminal portions in fungi and
mammals, is observed (Supplementary Figure S2). In
plants, the loop is significantly enlarged and, in contrast
to the other sequences, two clusters of positive charges can
be identified (Supplementary Figure S2). Up to now, no
other PIN domains that carry an extension with respect to
the canonical PIN domain fold at the same position as the
eukaryotic Nob1s have been described at the structural level.

The topology of the zinc ribbon domain is unusual
(Figures 5 and 7). Only two other zinc ribbon domains
with a similar topology were identified in the databases:
the human general transcription factor TFIIE
(Figure 8C, PDB:1VD4) (75), and
the as yet undescribed NMR structure of the isolated zinc ribbon
domain of Nob1 from Mus musculus (PDB:2CON).
Both exhibit a Cα RMSD of approximately 1.1 Å to
the zinc ribbon domain of PhNob1. However, the loop
of PhNob1 Zn between β2 and β3 has a different structural
arrangement in TFIIE, with one N-terminal
helix and an additional C-terminal β-hairpin (Figure 8C).
Remarkably, the isolated mouse Nob1 domain contains
two additional short β-strands (β4, β5) at its C-terminus
forming a β-hairpin unconnected to the central
three-stranded β-sheet and a long disordered C-terminal
tail (Figure 8C). This additional β-hairpin represents the
N-terminal part of the C-terminal extension observed
only in eukaryotic Nob1 proteins in comparison
with the archaeal proteins (Figure 1 and Supplementary
Figure S2). The entire C-terminal extension is in general
more positively charged and shows at its N-terminus a
moderate conservation among all eukaryotic species,
which according to the structure (2con) forms the
β-hairpin structure (Figure 8C and Supplementary
Figure S2).

In addition, the two zinc ribbon domains of the serine/
arginine (SR)-rich Ran-binding protein ZRanB2 (Figure
8C, PDB:3G9Y) (76,77) are similar to the PhNob1 zinc
ribbon domain with respect to the structure of the first two
β-strands and the spatial orientation of the zinc binding
residues. However, they contain a second β-hairpin not
connected to β2 and β3, in contrast to β3 of the PhNob1
zinc ribbon domain. The RMSD between the 24 equivalent
Cα positions is approximately 1.6 Å. Interestingly, for
this domain an RNA-protein complex has been
crystallized that shows how single-stranded RNA is
recognized in a sequence-specific manner. The RNA is
bound to the outer edge of β-strand β2 and by hydrogen
bonds and π-stacking interactions to basic and aromatic
residues, respectively, in the loop connecting β2 and β3.
However, the RNA-binding residues are not conserved
between the ZRanB2 zinc fingers and PhNob1. In particular,
the loop connecting β2 and β3 in PhNob1 does not
contain many basic and hydrophilic amino acids and lacks
the tryptophan observed to stack between two guanines
in ZRanB2. Therefore, it is an unlikely candidate for
RNA interactions in the PhNob1 zinc ribbon domain.
This already suggests that the RNA-recognition mode of

Figure 9. Model visualizing the possible positioning of Nob1 on the
small subunit of the ribosome. Shown is the ribosomal surface (gray,
PDB: 3o2z) (82) with H40 highlighted in green and site D in yellow. One
representative structure of the PhNob1 NMR ensemble is shown as
overlay in a cartoon representation. The PIN domain is oriented such
that the conserved aspartates of the active site (D12, D82 and D100)
point to the last nucleotide of the 3'-end of the rRNA. The Zn ribbon
domain is packed with its β-sheet surface against H40. The linker
between PIN and Zn ribbon domain is in a relaxed and not fully
extended conformation. The inset on the upper right side shows the
positioning in the context of the small ribosomal subunit.

the PhNob1 zinc ribbon domain differs from that of the
ZRanB2 zinc finger despite their structural similarities.

The two domains of PhNob1 are not only structurally
independent but also show a clear division of function.
The PIN domain contains the catalytically active center
required for endonucleolytic substrate cleavage as mapped
in Mn2+-titration experiments, and it binds single-stranded
RNAs in the vicinity of the active center. The binding to
single-stranded RNA is rather weak, but we observe a
binding preference for larger substrate RNAs. In line with this,
mutations in amino acid residues involved in substrate
binding either diminish the cleavage activity or result in
the loss of sequence specificity of the cleavage reaction
(Figure 6). Thus, the experimentally observed cleavage
specificity is probably not only the result of the interactions
of the substrate with the PIN domain alone.

The isolated basic zinc ribbon domain is sufficient to
bind an RNA helix such as helix 40 of the SSU RNA
with a micromolar affinity (Figure 7), but it has a significantly
lower affinity for single-stranded substrate RNAs.
These observations are consistent with the in vivo
cross-link results observed for yeast Nob1 (21) and show
that this interaction is conserved even in archaea. This
suggests an anchoring function for the zinc ribbon
domain in Nob1. However, our results also suggest that
additional factors influence the positioning of Nob1, as the
zinc ribbon domain itself does not exhibit a high specificity
for helix 40 in vitro (Figure 7).

A combination of a high-affinity RNA-anchoring
domain connected to a catalytic domain with a low
affinity for its target RNA by a flexible linker has also
been suggested for other RNA remodeling enzymes,
such as the RNA helicases YxiN (78,79), Hera (80) and
members of the RNase III family (see, for example,
Ref. 81). It is thought that in some cases, the binding of
additional protein factors can modulate the interdomain
orientation (78). The specificity for the substrate is in these
cases defined by the RNA-binding module, which recognizes
a primary RNA-binding site with high affinity. This
then restricts the catalytic domain to targets in the spatial
vicinity, which are bound with much lower affinities.
Indeed, the structural dimensions of PhNob1 perfectly
match the distance between helix 40 and site D in the mature
small ribosomal subunit of S. cerevisiae (Figure 9)
(82). As Nob1 is recruited to the preribosomal complex
within the nucleolus but site D cleavage occurs only in the
cytosol (11), three alternative explanations for the regulatory
mechanism can be proposed: (i) the distance between
helix 40 and site D might be significantly larger in earlier
(nuclear) preribosomal complexes, and structural changes
in the complex, e.g. triggered by release of export factors
or ribosomal protein binding, might deliver helix
40-bound Nob1 into close proximity of the cleavage site;
(ii) additional interactions with proteins or RNA might
protect the D-cleavage site and restrict access of Nob1
before their release after export; or (iii) other cofactors
might keep Nob1 in a deactivated form in the nucleus
or activate its catalytic activity in the cytoplasm for
D-cleavage. In the future, it will be important to analyze if
and how the binding of RNA, ribosomal proteins and ribosome
assembly factors influences the relative domain
orientation in Nob1 and the activation of its
endonucleolytic activity.